Contributor

@rbagd rbagd commented Aug 13, 2025

Description

The instrumentation keeps a reference to each task identifier (a string) indefinitely. This change makes sure the reference is removed once it is no longer needed.

Related #3458

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Unit test to ensure reference is not there

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@rbagd rbagd changed the title from "Do not keep task id reference indefinitely" to "Do not keep task id reference indefinitely in Celery instrumentation" Aug 13, 2025
@rbagd rbagd requested a review from a team as a code owner August 13, 2025 08:18
task_runtime_estimated = (default_timer() - start_time) * 1000

metrics = self.get_metrics()
self.assertEqual(CeleryInstrumentor().task_id_to_start_time, {})
Contributor

Should another test be added where the task_id does not exist and the pop operation returns None instead?

Contributor Author

Would that test actually matter? We're not testing the return value of the pop here in any case. I may not see your point. Can you maybe explain the case you have in mind?
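For reference, the behaviour of `dict.pop` with and without a default can be sketched on a plain dictionary. This is only an illustration: `start_times` and the task ids below are hypothetical stand-ins, not the instrumentor's actual state.

```python
# Plain-dict sketch; `start_times` and the task ids are hypothetical stand-ins.
start_times = {"task-1": 12.5}

# Known key: the stored value is returned and the entry is removed.
assert start_times.pop("task-1", None) == 12.5
assert start_times == {}

# Unknown key with a default: the default is returned and nothing is raised.
assert start_times.pop("task-unknown", None) is None

# Unknown key without a default: KeyError is raised.
try:
    start_times.pop("task-unknown")
except KeyError:
    missing_key_raised = True
else:
    missing_key_raised = False
```

With the `None` default, a task id that was never recorded is simply a no-op for the cleanup step.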

@xrmx xrmx moved this to Ready for review in @xrmx's Python PR digest Aug 22, 2025
self.task_id_to_start_time.get(task_id),
attributes=metric_attributes,
)
self.task_id_to_start_time.pop(task_id, None)
Contributor

What is it that we are leaking? On line 355 we set this to a time object, so I wouldn't expect it to keep anything alive on the Celery side?

Contributor Author

task_id_to_start_time is a dictionary initialised with CeleryInstrumentor. Each time a task finishes, the dictionary is augmented with a record of the time spent for that task id. In its current state the dictionary just keeps growing, even though the tasks finished a long time ago (task identifiers are normally unique). There are no references to any other Python objects that would prevent GC; it's just this relatively small issue.
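The growth described here can be reproduced with a minimal sketch. `DurationTracker`, its `cleanup` flag, and the `task_started`/`task_finished` methods are hypothetical names invented for illustration; only the `task_id_to_start_time` attribute name and the `pop(task_id, None)` call are taken from the diff, and the real instrumentor's bookkeeping may differ.

```python
from timeit import default_timer
from typing import Dict, Optional


class DurationTracker:
    """Hypothetical stand-in for the instrumentor's bookkeeping, not the real class."""

    def __init__(self, cleanup: bool) -> None:
        self.cleanup = cleanup
        self.task_id_to_start_time: Dict[str, float] = {}

    def task_started(self, task_id: str) -> None:
        # One entry is recorded per task id.
        self.task_id_to_start_time[task_id] = default_timer()

    def task_finished(self, task_id: str) -> Optional[float]:
        start = self.task_id_to_start_time.get(task_id)
        duration = None if start is None else default_timer() - start
        if self.cleanup:
            # Drop the entry so the dict does not grow with every unique task id.
            self.task_id_to_start_time.pop(task_id, None)
        return duration


leaky = DurationTracker(cleanup=False)
fixed = DurationTracker(cleanup=True)
for i in range(1000):
    task_id = f"task-{i}"
    for tracker in (leaky, fixed):
        tracker.task_started(task_id)
        tracker.task_finished(task_id)

print(len(leaky.task_id_to_start_time))  # 1000
print(len(fixed.task_id_to_start_time))  # 0
```

Each entry is small (a string key and a float), so the cost only becomes visible in long-running workers that process many uniquely identified tasks.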
